Synthesis of hybrid polynucleotide molecules using single-stranded polynucleotide molecules

ABSTRACT

A method is provided for making libraries of hybrid polynucleotide molecules from two parental polynucleotide molecules using single-stranded DNA. An example of the method comprises the following steps: (i) preparation of two single-stranded polynucleotide molecules comprising sequences containing one or more parts of homology and one or more parts of heterology, (ii) random or non-random fragmentation of said polynucleotides, (iii) hybridization of the fragmented molecules followed by de novo polynucleotide synthesis (i.e. polynucleotide chain elongation) on the hybridized molecules, (iv) separation of the chain elongation products (i.e. double-stranded polynucleotide molecules) into single-stranded polynucleotide molecules (denaturation), (v) hybridization of the resultant single-stranded polynucleotide molecules followed by de novo polynucleotide synthesis on the hybridized molecules, and (vi) repeating at least two further cycles of steps (iv) and (v).

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to a method for the production of hybrid polynucleotide molecules from two parental polynucleotide molecules using single-stranded polynucleotides, which can achieve effects superior to those of existing methods, such as a higher frequency of hybrid formation.

[0003] 2. Description of the Related Art

[0004] DNA shuffling is a technique developed to allow accelerated and directed protein evolution in vitro. In this method, the acquisition of genes encoding improved proteins is done in two steps. In the first step, a single gene is mutagenized, and desired mutant genes are selected. In the second step, the mutant genes are fragmented by DNase I, and subsequently recombined in vitro by using PCR. Among the recombinants (i.e., the products of DNA shuffling), those producing the most favored proteins are isolated (FIG. 1; 1, 2). Modified versions of DNA shuffling exist: (i) random priming was used to generate DNA fragments instead of DNase I digestion (3); (ii) PCR conditions with very short annealing/extension steps were employed to increase the frequency of recombination (4). Using these techniques, a number of improved enzymes have been obtained (1-9). When DNA shuffling is done using a set of homologous genes instead of a set of mutant genes derived from a single gene, the technique is called family shuffling. Family shuffling utilizes naturally occurring nucleotide substitutions among family genes as the driving force for the in vitro evolution. The application of the family shuffling strategy has also provided many successful examples (10-15).

[0005] A potential problem of family shuffling is a low yield of recombinants (i.e., hybrid molecules constituted from several family gene sequences). When two parental genes of 80% nucleotide sequence identity were shuffled, the frequency of hybrid formation was less than 1% (16-17). The low recombination yield may be due to a lower frequency of heteroduplex formation compared to the frequency of homoduplex formation (FIG. 2; 16).

[0006] Thus, it is desirable to develop methods which allow for the high-frequency production of recombinant molecules of two or more family genes. We have reported a technique in which restriction endonuclease-digested DNA fragments are shuffled instead of randomly fragmented DNA (16). The annealing of endonuclease-digested DNA fragments would produce homoduplexes at a high frequency, but significant DNA elongation only occurs on the heteroduplex molecules.

SUMMARY OF THE INVENTION

[0007] Contemporary genes belonging to the same gene family are derived from a single ancestral gene after repeated introduction of mutations through the natural divergent evolution processes (FIG. 3). The shuffling of the family gene sequences creates a library of chimeric genes which would yield gene products of diverse properties. Among the diverse properties expressed from the chimeric genes, desired ones can be selected.

[0008] In the shuffling of family genes by conventional methods (1-11, 13-15, 18), two types of annealing occur: homoduplex (annealing of strands derived from the same gene) and heteroduplex (annealing of strands derived from two different genes) (FIG. 2). If the frequency of homoduplex formation is higher than that of heteroduplex formation, the regeneration of the original gene sequences through PCR occurs preferentially over the formation of chimeric genes. The consequence is a low yield of recombinant molecules of two or more family genes.

[0009] We have now invented a new technique which uses single-stranded polynucleotide sequences (17). In this method, single-stranded polynucleotides of two family genes are prepared. One single-stranded polynucleotide is the coding strand of one gene, while the other single-stranded polynucleotide is the non-coding strand of the other gene. These two single-stranded polynucleotide molecules are fragmented into appropriate sizes, and their fragments are used for the family shuffling. This technique is preferable in that homoduplex formation is prevented in the first round of hybridization.

[0010] Based on this new technique, we completed the present invention, which can provide hybrid polynucleotide molecules from parental polynucleotides containing one or more parts of homology and one or more parts of heterology (e.g., family genes).

[0011] In this invention, two types of single-stranded polynucleotides are prepared from two parental polynucleotides (for example, two family genes, two genes obtained by random mutagenesis of a parental gene, etc.), and used for the in vitro polynucleotide synthesis. The first type is the single-stranded polynucleotide molecule corresponding to the coding strand of, for example, one family gene, while the second type is the single-stranded molecule corresponding to the non-coding strand of, for example, another family gene. In other words, these two types of single-stranded polynucleotide molecules are complementary to each other but have one or more parts which are not complementary (the first-type molecule comprises one or more parts of homology and one or more parts of heterology to the complementary sequence of the second-type molecule). These single-stranded polynucleotide molecules are fragmented into appropriate sizes, and hybridized. In this procedure, no homoduplex molecule is formed, and when appropriate enzyme(s) are present, nucleotide elongation can occur on heteroduplex molecules, forming chimeric polynucleotide pieces (FIG. 4). Subsequently, the steps of denaturation followed by hybridization and polynucleotide synthesis can be repeated as summarized in FIG. 1.

[0012] This method can be combined with appropriate methods for introducing one or more mutations into chimeric genes.

[0013] Thus, the present invention provides inter alia:

[0014] (1) A method for making libraries of hybrid polynucleotide molecules in which double-stranded polynucleotide molecules are not used as starting materials.

[0015] (2) The method of (1) above, wherein two types of single-stranded polynucleotide molecules are used as starting materials and wherein the first-type molecule comprises stretches of sequences containing one or more parts of homology and one or more parts of heterology to the complementary sequence of the second-type molecule.

[0016] (3) The method of (2) above, wherein the single-stranded polynucleotide molecules are fragmented and used as templates for de novo polynucleotide synthesis to create hybrid polynucleotide molecules.

[0017] (4) The method of (2) above, wherein mutations are introduced into hybrid polynucleotide molecules prior to, during, or after the production of the hybrid polynucleotide molecules.

[0018] (5) A method for making libraries of hybrid polynucleotide molecules, which comprises:

[0019] (i) preparing two single-stranded polynucleotide molecules comprising sequences which are complementary to each other,

[0020] (ii) randomly or non-randomly fragmenting the two single-stranded polynucleotide molecules,

[0021] (iii) incubating the fragmented molecules under conditions such that hybridization of fragmented polynucleotide molecules occurs and de novo polynucleotide synthesis on the hybridized molecules occurs,

[0022] (iv) denaturing the resultant elongated double-stranded polynucleotide molecules into single-stranded polynucleotide molecules,

[0023] (v) incubating the resultant single-stranded polynucleotide molecules under conditions such that hybridization of single-stranded polynucleotide molecules occurs and de novo polynucleotide synthesis on the hybridized molecules occurs, and

[0024] (vi) repeating at least two further cycles of steps (iv) and (v).

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] A more complete appreciation of the invention and many of the attendant advantages thereof will be readily obtained as the same become better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:

[0026] FIG. 1 illustrates a concept of DNA shuffling. The method for in vitro evolution of genes generally called DNA shuffling consists of repeated cycles of two steps: random mutagenesis and DNA shuffling of mutations. In the first step (random mutagenesis/screening), several mutants are isolated from a target gene. In the second step, the mutant genes are segmented (usually by DNase I), then reassembled by PCR without primers. The reassembled molecules are further amplified by PCR with appropriate primers. DNAs created by DNA shuffling are used to transform a recipient (usually Escherichia coli), and transformed strains are selected for desired phenotypes. Instead of the DNase I segmentation, random-priming synthesis of short fragments can be used.

[0027] FIG. 2 illustrates formation of homoduplex and heteroduplex from two parental polynucleotide molecules whose sequences are similar. When two parental polynucleotide molecules whose sequences are similar but not identical to each other are denatured and hybridized, two types of hybridized molecules are formed: one strand of a polynucleotide molecule hybridized with a complementary strand derived from the same parent (homoduplex), and one hybridized with a complementary strand derived from a different parent (heteroduplex). The velocity constant for homoduplex formation (k₁) may be higher than that for heteroduplex formation (k₂).

[0028] FIG. 3 illustrates a concept of family shuffling. Contemporary genes belonging to the same gene family are derived from a single ancestral gene after repeated introduction of mutations through the natural divergent evolution processes. The shuffling of the family gene sequences creates a library of chimeric genes from which desired chimeras are selected.

[0029] FIGS. 4A and 4B illustrate hybridization of polynucleotides and nucleotide synthesis in family shuffling.

[0030] 4A: Two polynucleotide strands (e.g. DNA) of two family genes are indicated by open and closed boxes. In family shuffling using two genes, two types of annealing occur: homoduplex (annealing of strands derived from the same gene) and heteroduplex (annealing of strands derived from two different genes). If the frequency of homoduplex formation is higher than that of heteroduplex formation, the regeneration of the original (parental) gene sequences occurs preferentially over the formation of chimeric genes.

[0031] 4B: To prevent homoduplex formation in the first round of nucleotide synthesis, single-stranded polynucleotide molecules are prepared from two family genes. These single-stranded polynucleotide molecules are fragmented by appropriate methods (e.g. by DNase I digestion), and hybridized. In contrast to conventional methods using double-stranded polynucleotide molecules, only heteroduplex molecules are formed in this invention.

[0032] FIG. 5 illustrates chimeric structures of the single-stranded DNA-based shuffling products. This is an example where single-stranded DNAs of nahH and xylE were shuffled, fifty shuffled clones exhibiting the C23O activity were randomly selected, and their nucleotide sequences were determined as described in the Example below. The sequences derived from nahH and xylE are shown by shaded and solid boxes, respectively. White boxes show the expected recombination regions, and solid triangles show the point mutations.

DETAILED DESCRIPTION OF THE INVENTION

[0033] The term “parental polynucleotide molecule” is used to indicate a polynucleotide molecule species which is a starting material for in vitro manipulations.

[0034] The term “hybrid polynucleotide molecule” is used to indicate that the polynucleotide molecule is constituted from sequences derived from several parental polynucleotide molecules. The hybrid polynucleotide molecule may or may not contain base substitution(s) compared to the parental sequences, as mutation(s) can be introduced during the processes forming the hybrid polynucleotide molecule.

[0035] The term “initial template” is used to indicate that the molecule is used as a starting material for in vitro manipulations, and serves as a template for the de novo polynucleotide synthesis.

[0036] The term “complementary” is used to mean that two polynucleotide sequences are homologous enough (not necessarily 100% identical) to form a stable hybridized molecule at a temperature above 0° C.

[0037] The term “random or non-random fragmentation” means the process of generating polynucleotide molecules shorter than the length of an initial template. The fragmentation can be attained by several means, for example, by digestion of a single-stranded polynucleotide with appropriate nuclease(s), by physical shearing (e.g. sonication) of a polynucleotide molecule, or by random priming of a polynucleotide to synthesize polynucleotide fragments complementary to the template sequence.

[0038] The term “in vitro manipulations” is used to mean the various in vitro manipulations required for the synthesis of hybrid polynucleotide molecules starting from two or more types of homologous polynucleotide molecules.

[0039] The terms “homology” and “homologous” are used to indicate that the partial sequences of two or more polynucleotide molecules are either the same or similar to each other, or highly complementary to each other, so that two complementary strands derived from two substantially homologous sequences can stably hybridize to each other at a temperature above 0° C. The term “heterology” is used to indicate the opposite meaning.

[0040] The DNA shuffling methods (1-11, 13-15, 18) have been developed for accelerated and directed protein evolution in vitro. The methods generally consist of two steps: (i) the isolation of several improved mutants of the gene, and (ii) the in vitro recombination of the mutant genes followed by the selection of the best progeny. These steps are illustrated in FIG. 1. In the first step, a single gene is mutagenized by one of the established methods, and desired mutant genes are selected. In the second step, fragments of the mutant genes are obtained, e.g. by DNase I treatment or random priming, and the mixture of fragments is subjected to PCR without primers. DNA pieces derived from different mutants are recombined in this PCR. Next, PCR is carried out with forward and reverse primers to amplify the full-length DNA segments.

[0041] Instead of using a set of mutant derivatives as shown in FIG. 1, a set of homologous genes can be shuffled to obtain genes synthesizing improved enzymes. This technique is called family shuffling. Family shuffling utilizes naturally occurring nucleotide substitutions among family genes as the driving force for the in vitro evolution. A potential problem of family shuffling is a low yield of recombinants (i.e., hybrid molecules constituted from several family gene sequences). When two parental genes of 80% nucleotide sequence identity (xylE and nahH, see Examples) were shuffled, the frequency of hybrid formation was less than 1% (16-17).

[0042] Thus, it is desirable to develop methods which allow for the high-frequency production of recombinant molecules of two or more family genes. The present invention relates to the DNA shuffling methods summarized in FIG. 1, but differs from them in that single-stranded polynucleotide molecules are used as starting materials instead of double-stranded polynucleotide molecules. As described below, this method allowed recombination of family genes, etc. at a frequency higher than that in conventional DNA shuffling methods.

[0043] After the failure of family shuffling in our hands, we considered possible reasons for it. For the successful generation of recombinant progeny of two family genes, A and B, a hybrid between single-stranded DNA of A and single-stranded DNA of B (heteroduplex) must be formed, on which the elongation of DNA proceeds to form a recombined molecule of A and B (FIG. 4A). However, in this method, hybridization between A and A or between B and B also occurs, forming homoduplex molecules (FIG. 2). If the probability of the formation of heteroduplex molecules (an A strand hybridized with a B strand) is smaller than that of homoduplex molecules, i.e. an A (or B) strand hybridized with another A (or B) strand (FIG. 2), the frequency of hybrid molecules between A and B may be very low in the products of the family shuffling. A computer simulation supported this inference.

[0044] To obtain high-frequency recombination in polynucleotide shuffling, we sought to elevate the probability of the formation of heteroduplex molecules or to reduce the probability of the formation of homoduplex molecules. In an attempt to increase the probability of heteroduplex formation in the family shuffling of xylE and nahH (see below, Examples), the annealing temperature in the PCR reactions was lowered. However, the majority of the shuffling products still had the structure of either xylE or nahH (i.e., the frequency of the formation of xylE-nahH hybrids was very low).

[0045] We interpreted these results as indicating that the probability of homologous pairing is much higher than that of heterologous pairing in the shuffling of two genes of modest homology (e.g. about 80% homology). We then attempted to decrease the probability of homoduplex formation. For the successful generation of recombinant progeny of two family genes, hybridization between a single-stranded polynucleotide molecule of one family gene and a single-stranded polynucleotide molecule of another family gene is required. However, in the conventional family shuffling methods, the hybridization of two strands derived from the same gene would occur preferentially over the hybridization of two strands derived from different family genes.

[0046] As a result of extensive studies, we newly found that the use of single-stranded polynucleotides for shuffling can achieve the object of the present invention.

[0047] The present invention uses single-stranded polynucleotides instead of double-stranded DNA as starting materials for preparing hybrid polynucleotide molecules from the parental polynucleotides. Thus, the present invention provides a method for making libraries of hybrid polynucleotide molecules, a method for forming a mutagenized double-stranded polynucleotide, a method for obtaining a chimeric polynucleotide sequence, etc., wherein single-stranded polynucleotides instead of double-stranded DNA are used as starting materials.

[0048] One embodiment of the method of the present invention includes the following steps:

[0049] (i) preparation of two single-stranded polynucleotide molecules comprising parts of sequences which are complementary to each other,

[0050] (ii) random or non-random fragmentation of them,

[0051] (iii) hybridization of the fragmented molecules followed by polynucleotide synthesis (i.e. polynucleotide chain elongation) on the hybridized molecules,

[0052] (iv) separation of the chain elongation products (i.e. double-stranded polynucleotide molecules) into single-stranded polynucleotide molecules (denaturation),

[0053] (v) hybridization of the resultant single-stranded polynucleotide molecules followed by polynucleotide synthesis on the hybridized molecules, and

[0054] (vi) repeating at least two further cycles of steps (iv) and (v) (e.g., under PCR conditions).

[0055] The present invention involves the formation of hybrid polynucleotide molecules by mixing two parental polynucleotide sequences. Partial homology should exist between the two parental polynucleotide sequences so that the complementary strands derived from some regions of the two parental polynucleotide molecules can hybridize to each other under the conditions required for the de novo polynucleotide synthesis. In other words, the two complementary polynucleotide sequences have one or more parts of homology and one or more parts of heterology to each other. The required homology is dependent upon several parameters including the G-C contents of the parental molecules. The present invention is applicable even if the total homology is low, e.g., less than about 95%, and even less than about 80%. Preferably, each of the homologous regions (i.e., parts of homology) should be at least 15 bases long with a homology higher than 75%.
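As a minimal, non-limiting sketch of how the preferred criterion above (regions at least 15 bases long with homology higher than 75%) might be checked, the following Python function scans two sequences with a sliding window. It assumes the two sequences have already been aligned to equal length and are written in the same orientation; the function names and the toy sequences are hypothetical and are not part of the original disclosure.

```python
def window_identity(a, b):
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def homologous_regions(seq_a, seq_b, window=15, min_identity=0.75):
    """Return merged (start, end) spans covered by windows of `window`
    bases whose identity is higher than `min_identity`.

    Assumption: seq_a and seq_b are pre-aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    regions = []
    for start in range(len(seq_a) - window + 1):
        end = start + window
        if window_identity(seq_a[start:end], seq_b[start:end]) > min_identity:
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent window: extend the previous span.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy example: the first 20 bases are identical, the last 20 are unrelated.
gene_a = "ACGTACGTACGTACGTACGT" + "A" * 20
gene_b = "ACGTACGTACGTACGTACGT" + "T" * 20
print(homologous_regions(gene_a, gene_b))
```

Because a span is reported whenever any qualifying window covers it, the reported span can extend a few bases into a mismatched stretch; a stricter definition would trim the ends.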

[0056] The total length of each of the two polynucleotide sequences is not particularly limited. Generally, about 30 bases to about 10,000 bases is preferable, and about 100 bases to about 2,000 bases is more preferable.

[0057] In the case where the mixing of more than two parental polynucleotide sequences is desired, hybrids of two parental sequences should be formed first; then, single-stranded polynucleotide sequences of the hybrids should be prepared and mixed with the third parental polynucleotide sequence.

[0058] Various existing methods for single-stranded DNA preparation can be utilized in the present invention. In one such method, the genes are cloned either in a phagemid vector or in a single-stranded DNA phage such as M13, and the single-stranded DNA is prepared from the filamentous phage particles. The two homologous genes are separately cloned in phagemid vectors (for example, pBluescript) or single-stranded DNA phage vectors. One gene should be inserted in the same orientation as the origin of single-stranded DNA replication, while the other gene should be inserted in the opposite orientation with respect to that origin. Then, the coding strand of one gene and the non-coding strand of the other gene are synthesized and packaged in phage particles. In other words, the single-stranded DNAs of the two homologous genes are complementary to each other.
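The relationship between the two packaged strands can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sequences are invented toy sequences (not nahH or xylE), and the helper names are hypothetical. The non-coding strand of the second gene is the reverse complement of its coding strand, so it pairs with the coding strand of the first gene everywhere except at the positions where the two genes differ.

```python
# Translation table for Watson-Crick base pairing (DNA only).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(coding_a, noncoding_b):
    """Positions (in coding_a coordinates) where the non-coding strand of
    gene B does not base-pair with the coding strand of gene A.

    Assumption: both genes have the same length and are in register.
    """
    b_as_coding = reverse_complement(noncoding_b)
    return [i for i, (x, y) in enumerate(zip(coding_a, b_as_coding)) if x != y]


# Hypothetical toy genes differing at a single position (index 5).
gene_a_coding = "ATGGCACGT"
gene_b_coding = "ATGGCTCGT"
gene_b_noncoding = reverse_complement(gene_b_coding)

print(gene_b_noncoding)
print(mismatch_positions(gene_a_coding, gene_b_noncoding))
```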

[0059] The plasmids thus constructed are introduced into “male” E. coli such as JM109 (as infection by single-stranded DNA phage requires F pili). If phagemids are used, cultures of the transformants should be infected with a helper phage such as VCS-M13 (Stratagene) or M13KO7 (Pharmacia) to rescue single-stranded DNA from the phagemids.

[0060] Phagemid DNA packaged in phage capsules, as well as helper phage, is recovered from the supernatant of the culture. Single-stranded DNA is then prepared from the phage capsules. The presence of the helper phage DNA does not interfere with the following reactions.

[0061] Methods other than the use of phagemid/phage vectors exist for the preparation of single-stranded DNA; asymmetric PCR is one such technique.

[0062] The single-stranded DNA prepared above may contain double-stranded DNA as a contaminant, but the amount of the double-stranded DNA is preferably controlled to 30% or less, more preferably 5% or less.

[0063] The single-stranded DNA is then fragmented by an appropriate method. The degree of fragmentation (i.e., length of the fragmented polynucleotides) is not particularly limited. One skilled in the art can appropriately decide the degree of fragmentation taking into consideration the full length of the polynucleotides to be shuffled, the GC content thereof, the intended degree of shuffling, etc.

[0064] Examples of the method to achieve fragmentation include use of DNase I, but other methods such as shearing by sonication can also be used. In general, fragments in the size range larger than 20 bases may be isolated from a 2-3% low-melting-point agarose gel. Thus, preferably, the fragmentation may be controlled so that the average length of the fragmented polynucleotides is from about 20 to about 500 bases. Our data showed that the size range of about 40-200 bases is further preferable for the mixing of two polynucleotide sequences of lower homology (e.g., about 80% homology), but the present invention is not limited to this range.
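The effect of random fragmentation followed by size selection can be sketched in Python as follows. This is a toy model only: it assumes that every internal bond is cut independently with the same probability, which is a simplification and not a description of DNase I kinetics, and the function names, the seed, and the example sequence are hypothetical. The 40-200 base window is the further preferable range stated above.

```python
import random


def random_fragments(seq, mean_len, rng):
    """Cut `seq` at random positions.

    Assumption: each internal bond is cut independently with probability
    1/mean_len, so fragment lengths average roughly `mean_len` bases.
    """
    fragments, start = [], 0
    for pos in range(1, len(seq)):
        if rng.random() < 1.0 / mean_len:
            fragments.append(seq[start:pos])
            start = pos
    fragments.append(seq[start:])
    return fragments


def size_select(fragments, min_len=40, max_len=200):
    """Keep fragments within the size window (mimics gel isolation)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


rng = random.Random(0)              # fixed seed so the sketch is repeatable
template = "ACGT" * 250             # hypothetical 1,000-base single strand
pieces = random_fragments(template, mean_len=100, rng=rng)
selected = size_select(pieces)
print(len(pieces), "fragments;", len(selected), "within 40-200 bases")
```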

[0065] If regions with low sequence homology exist, uninterrupted polynucleotide segments covering these heterologous regions should be prepared, because hybridization, and hence recombination, of the two parental polynucleotide molecules would not occur at a high frequency in these regions.

[0066] Fragments of single-stranded DNA thus prepared are then used for the reassembly of fragments and the amplification of assembled fragments, the steps commonly used in conventional DNA shuffling methods. Examples of family shuffling using single-stranded DNA are described below (see Examples). The frequency of chimeric gene formation using single-stranded DNA was much higher than that using the original DNA shuffling methods with double-stranded DNAs.

[0067] In the single-stranded DNA-based methods using two family genes, single-stranded fragments of one gene can anneal only to single-stranded fragments of the other gene in the 1st hybridization cycle. In the case that a mixture of single-stranded and double-stranded polynucleotide molecules is used for the 1st hybridization cycle (if single-stranded polynucleotide molecules are prepared by asymmetric PCR, the product is a mixture of double-stranded and single-stranded polynucleotide molecules), the probability of heteroduplex formation is not 100%, but it is higher than that using solely double-stranded polynucleotide molecules. Although the elongation products can form homoduplexes in the 2nd or later cycles, the frequency of heteroduplex formation would be higher in the shuffling using single-stranded DNA than in the shuffling using double-stranded DNA.
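The qualitative argument above can be made concrete with a simple mass-action sketch in Python. It is an illustrative assumption, not a model given in this specification: duplex formation rates in the 1st hybridization cycle are taken to be proportional to the product of strand concentrations, with the velocity constants k₁ (homoduplex) and k₂ (heteroduplex) of FIG. 2. The desired strands (coding strand of gene A, non-coding strand of gene B) are set to a concentration of 1, and the unwanted complementary strands arising from double-stranded contaminant are set to a relative concentration `contaminant`. The numerical values of k₁ and k₂ used below are arbitrary.

```python
def initial_heteroduplex_fraction(k_homo, k_hetero, contaminant):
    """Fraction of initially formed duplexes that are heteroduplexes.

    Assumptions (illustrative only):
      * strands A+ and B- are present at concentration 1.0;
      * strands A- and B+ are present at concentration `contaminant`
        (0.0 = pure single strands, 1.0 = fully double-stranded input);
      * rates follow simple mass action.
    """
    a_plus = b_minus = 1.0
    a_minus = b_plus = contaminant
    homo = k_homo * (a_plus * a_minus + b_plus * b_minus)
    hetero = k_hetero * (a_plus * b_minus + b_plus * a_minus)
    return hetero / (homo + hetero)


# Arbitrary example values: homoduplex formation ten times faster.
k1, k2 = 10.0, 1.0
for c in (0.0, 0.05, 0.3, 1.0):
    print(c, round(initial_heteroduplex_fraction(k1, k2, c), 3))
```

Under these assumptions the fraction equals 1 for pure single-stranded input and falls to k₂/(k₁ + k₂) for fully double-stranded input, with intermediate values for partially contaminated preparations.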

[0068] When single-stranded DNA was used as the starting material, the frequency of hybrid formation was 17% for two genes (xylE and nahH) having a sequence divergence of about 20%. This frequency was lower than that of the restriction enzyme-based DNA shuffling, in which almost all shuffling products were chimeric genes (16). However, although the restriction enzyme-based DNA shuffling has many advantages, the variety of shuffling products might be limited to a certain degree, because the variation of gene fragments generated by restriction enzyme digestion is limited compared to that generated by random fragmentation.

[0069] In in vitro protein evolution, both the efficient recombination of genes and the effective screening of recombinants are required. If a very effective screening method, such as antibiotic resistance, is available, the desired chimeras would be obtained even if the recombination efficiency is low. The selection of improved β-lactamase can be cited as an example of an easy screen. In this case, one variant among 10⁶ progeny could easily be selected. However, for the improvement of most industrially relevant enzymes, no selection is possible. The enzyme properties of the progeny must be determined one by one, using time-consuming assays. Under such circumstances, shuffling techniques with efficient recombination are of use even though these techniques require additional manipulations.

[0070] Other detailed conditions, reagents to be used, procedures, etc. for the above-described steps such as preparation of single-stranded DNA, random or non-random fragmentation of DNA molecules, annealing (hybridization), separation of double-stranded DNA into single-stranded DNA, polynucleotide synthesis (i.e., polynucleotide chain elongation) on the hybridized molecules, etc. can be appropriately decided by taking into consideration those known in the art, for example, in “Molecular Cloning, A Laboratory Manual” (edit. by T. Maniatis et al., Cold Spring Harbor Laboratory, 1989), U.S. Pat. No. 5,830,721, etc., both herein incorporated by reference.

[0071] In conclusion, the present invention provides a shuffling method using single-stranded DNA which can increase the probability of hybrid formation compared with that using double-stranded DNA. This technique should be especially useful for the family shuffling of any genes.

EXAMPLE

[0072] The following results are one example of many possible embodiments of the present invention.

[0073] (1) Family Shuffling of xylE and nahH Using Double-Stranded DNA

[0074] The xylE and nahH genes both encode catechol 2,3-dioxygenase (C23O). Their nucleotide sequences are approximately 80% identical. When we applied the previously described family shuffling techniques to obtain hybrid genes of xylE and nahH, the formation of chimeric genes was severely restricted, and only the parental genes, nahH and xylE, were generated as described below.

[0075] According to the method outlined by Zhao et al. (18), we carried out family shuffling between xylE and nahH, and the products of the family shuffling were cloned in pBluescript SK(+). After the transformation of E. coli cells using the ligated DNA, about 50% of the colonies developed on the selective plates showed the C23O activity. When 50 clones showing the C23O activity were randomly selected and their C23O genes were sequenced, it was found that none of them were chimeric. The products had the structure of xylE (26 clones), nahH (15 clones), or their point mutants (9 clones). Fifty additional clones were checked; however, not a single chimeric gene was found. All these results demonstrated that the frequency of hybrid formation was less than 1%.

[0076] Thus, preventing the regeneration of the original gene structures is quite important for successful family shuffling between nahH and xylE.

[0077] (2) Family Shuffling Using Single-Stranded DNA

[0078] Family shuffling using single-stranded DNA was performed as described in section (3) and illustrated in FIG. 4. After the transformation of E. coli cells, about 64% of the colonies grown on plates showed the C23O activity. When 50 randomly selected clones exhibiting the C23O activity were analyzed for the nucleotide sequences of their C23O genes, 7 of them (14%) were chimeric, and the others were either parental genes (40 clones) or their point mutants (3 clones). The structures of the chimeric clones are shown in FIG. 5. Two of them, hybrids 1 and 4, had the same nucleotide sequence.
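The kind of sequence analysis described above, sorting sequenced clones into parental genes, point mutants, and chimeras and then computing the chimera frequency, can be sketched in Python as follows. The classification rule is a simplifying assumption and not the procedure actually used: a clone is called chimeric when it carries parent-specific bases from both parents at positions where the aligned parents differ, so a point mutation that happens to recreate the other parent's base would be miscounted. The toy sequences and function name are hypothetical; only the clone counts (7, 40, 3) are taken from the paragraph above.

```python
def classify_clone(clone, parent_a, parent_b):
    """Label a clone as 'parental', 'chimeric', or 'point mutant'.

    Assumption: all three sequences are pre-aligned and of equal length.
    """
    if clone in (parent_a, parent_b):
        return "parental"
    origins = set()
    for c, a, b in zip(clone, parent_a, parent_b):
        if a != b:                 # diagnostic position: parents differ here
            if c == a:
                origins.add("A")
            elif c == b:
                origins.add("B")
    # Bases from both parents at diagnostic positions indicate a hybrid.
    return "chimeric" if origins == {"A", "B"} else "point mutant"


# Hypothetical toy parents differing at positions 2, 5 and 8.
parent_a = "AAAAAAAAAA"
parent_b = "AACAAGAATA"
print(classify_clone("AACAAAAAAA", parent_a, parent_b))
print(classify_clone("AAAAAAAAAT", parent_a, parent_b))

# Tally using the clone counts reported for the single-stranded experiment.
counts = {"chimeric": 7, "parental": 40, "point mutant": 3}
total = sum(counts.values())
chimeric_percent = 100 * counts["chimeric"] / total
print(total, "clones;", chimeric_percent, "% chimeric")
```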

[0079] The frequency of chimeric gene formation using single-stranded DNA was much higher than that using the original sexual PCR method with double-stranded DNAs. Most likely, the probability of homoduplex formation in the annealing step of the 1st PCR cycle using double-stranded gene fragments is much higher than that of heteroduplex formation because of the nahH and xylE sequence divergence of 20%. On the other hand, when both the single-stranded nahH and the single-stranded xylE were digested by DNase I, nahH gene fragments could anneal only to fragmented xylE in the 1st PCR cycle. Although the elongation products could form homoduplexes in the 2nd or later cycles, the frequency of heteroduplex formation was expected to increase in the shuffling using single-stranded DNA.

[0080] The only difference between the methods using single-stranded DNA and double-stranded DNA is the template DNA for fragmentation using DNase I or other appropriate methods. For single-stranded DNA preparation, the genes have to be cloned either in a phagemid vector or in a single-stranded DNA phage such as M13, and the single-stranded DNA has to be prepared from the filamentous phage particles. Although the shuffling using single-stranded DNAs needs one more step, it is worthwhile since nahH-xylE chimeric genes were formed much more efficiently than in the shuffling using double-stranded DNA.

[0081] (2) Detailed Description of One of Many Possible Protocols of theInvention

[0082] (2-1) Preparation of ssDNAs

[0083] 1. Clone two homologous genes separately in phagemid vectors (for example, pBluescript). One gene should be oriented in the same direction as the origin of single-stranded DNA replication, while the other gene should be oriented in the opposite direction. The coding strand of one gene and the non-coding strand of the other gene are then synthesized and packaged in phage particles. In other words, the single-stranded DNAs of the two homologous genes are complementary to each other.

[0084] 2. Introduce the plasmids thus constructed into a “male” E. coli strain such as JM109 (infection by single-stranded DNA phage requires F pili).

[0085] 3. Inoculate a 1 ml culture of 2×YT medium supplemented with an appropriate antibiotic with a single JM109 transformant colony.

[0086] 4. Grow the culture at 37° C. with shaking (e.g. rotary shaking at 250 rpm) to an optical density at 600 nm of 2.

[0087] 5. Inoculate 25 ml of 2×YT containing an appropriate antibiotic in a 250 ml Erlenmeyer flask with 0.5 ml of the above culture.

[0088] 6. Incubate for 1 h with shaking at 37° C.

[0089] 7. Infect with a helper phage at an m.o.i. (multiplicity of infection) of 10-20. Helper phages such as VCS-M13 (Stratagene) and M13KO7 (Pharmacia) are used for the rescue of ssDNA from phagemids. These helper phages can be prepared by propagating them on a male E. coli strain.

[0090] 8. Incubate overnight with shaking at 37° C.

[0091] 9. Next day, harvest the cells by centrifuging at 12,000×g at 4° C. for 15 min.

[0092] 10. Transfer the supernatant (containing phage particles) to a fresh tube. Do not carry over any of the pellet.

[0093] 11. Spin the supernatant again and transfer the supernatant to a fresh tube.

[0094] 12. Add 0.25 volume of phage precipitation solution to the supernatant. Leave on ice for at least 1 h, or overnight at 4° C.

[0095] 13. Centrifuge at 12,000×g for 20 min at 4° C. In the presence of PEG, phage particles precipitate.

[0096] 14. Remove the supernatant, resuspend the pellet in 400 μl of TE buffer, and transfer to a 1.5-ml tube.

[0097] 15. Add one volume of TE-saturated phenol:chloroform:isoamyl alcohol (25:24:1) to the sample, vortex for at least 1 min and centrifuge at 12,000×g for 5 min.

[0098] 16. Transfer the upper phase (containing phagemid DNA) to a fresh tube without disturbing the interface. Repeat the organic solvent extraction until no visible material appears at the interface.

[0099] 17. Add 0.5 volume (200 μl) of 7.5 M ammonium acetate plus two volumes (1.2 ml) of 100% ethanol. Mix and leave at −20° C. for 30 min to precipitate the phagemid DNA.

[0100] 18. Centrifuge at 12,000×g for 5 min, remove the supernatant and carefully rinse the pellet with ice-cold 70% ethanol. If the pellet is disturbed, centrifuge again for 2 min. Drain the tube and dry the pellet under vacuum.

[0101] 19. By agarose gel electrophoresis, two major bands corresponding to helper phage DNA and single-stranded DNA from the phagemid are usually seen. A small amount of chromosomal DNA and RNA released by cell lysis may be present. Longer incubation during phagemid rescue increases the contamination by E. coli chromosomal DNA. Single-stranded DNA migrates faster than double-stranded (ds) DNA of the same length. The presence of the helper phage DNA does not interfere with the following reactions.
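The relative volumes used in steps 12 and 17 above can be checked with a short calculation. This is a minimal sketch that assumes the resuspended sample is 400 µl (the volume that fits a 1.5-ml tube), that "two volumes" of ethanol refers to the sample after the ammonium acetate has been added, and that roughly 25 ml of supernatant is recovered from the 25 ml culture. The function names are hypothetical.

```python
def precipitation_volumes(sample_ul):
    """Return the ammonium acetate and ethanol volumes (in µl) for step 17."""
    acetate_ul = 0.5 * sample_ul               # 0.5 volume of 7.5 M ammonium acetate
    ethanol_ul = 2 * (sample_ul + acetate_ul)  # two volumes of 100% ethanol
    return acetate_ul, ethanol_ul

def peg_volume(supernatant_ml):
    """Return the phage precipitation solution volume (in ml) for step 12."""
    return 0.25 * supernatant_ml

acetate, ethanol = precipitation_volumes(400)
print(acetate, ethanol)  # 200.0 1200.0
print(peg_volume(25))    # 6.25 (assumes about 25 ml of supernatant)
```

The result for step 17, 200 µl of ammonium acetate and 1.2 ml of ethanol, matches the volumes given in the protocol.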

[0102] (2-2) DNase I Treatment

[0103] 1. Dilute 2 μg (or 3 pmol) of each DNA in 45 μl of TE buffer, and add 5 μl of 10× DNase I digestion buffer.

[0104] 2. Incubate the solution at 15° C. for 5 min, then add 0.3 U of DNase I.

[0105] 3. Incubate further for 2 min, then transfer to 90° C. and incubate for 10 min to terminate the reaction. The concentration of DNase I and the time and temperature of the reaction should be optimized, as the digestion speed of DNase I is influenced by many factors including the nature of the DNA. After DNase I treatment and subsequent incubation at 90° C., a brown precipitate may form. This is removed by centrifugation, and the supernatant is then used for PCR.

[0106] 4. Fragments in the size range of 40-100 bases are isolated from a 2-3% low-melting-point agarose gel using the QIAEX II gel extraction kit.

[0107] 5. Resuspend the recovered DNA in 50 μl of TE buffer.
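The fragmentation and size selection described above can be pictured with a toy simulation. This is a minimal sketch, not a model of DNase I kinetics: it assumes cuts fall independently after each base with a fixed probability, and the 1000-base random template, the cut probability of 1/70 and the function names are hypothetical values chosen for illustration only.

```python
import random

def fragment(sequence, cut_probability, rng):
    """Cut the sequence after each base with the given probability."""
    fragments, start = [], 0
    for i in range(1, len(sequence)):
        if rng.random() < cut_probability:
            fragments.append(sequence[start:i])
            start = i
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments

def size_select(fragments, min_len=40, max_len=100):
    """Keep only fragments in the 40-100 base window used in step 4."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

rng = random.Random(0)  # fixed seed so the run is reproducible
template = "".join(rng.choice("ACGT") for _ in range(1000))
pieces = fragment(template, cut_probability=1 / 70, rng=rng)
selected = size_select(pieces)
print(len(pieces), len(selected))
```

Raising the cut probability shifts the distribution toward shorter fragments, which is one way to see why the enzyme concentration and the reaction time need to be optimized for each template.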

[0108] (2-3) PCR Reassembly and Amplification

[0109] 1. Add to a 1.5 ml tube 10 μl of 10× Pfu DNA polymerase buffer, 10 μl of dNTP mix, 10 μl of the purified fragments, 2.5 U of Pfu Turbo DNA polymerase and H₂O to a total volume of 100 μl.

[0110] 2. Perform PCR: 40 cycles of denaturation at 94° C. for 1 min, annealing at 56° C. for 1 min and elongation at 72° C. for 60+5 sec/cycle (the duration of elongation is increased by 5 sec after each cycle). This PCR amplification becomes difficult when the target genes are large or the mean size of the DNase I-cleaved DNA is small. If the yield of the PCR products is low, the number of cycles in this PCR step should be increased.

[0111] 3. Add to a new 1.5 ml tube 1-5 μl of the reassembled PCR product, 10 μl of 10× Pfu DNA polymerase buffer, 10 μl of dNTP mix, 2 μl of forward and reverse primers, 2.5 units of the Taq/Pfu DNA polymerase mixture and H₂O to a total volume of 100 μl.

[0112] 4. Perform PCR: 96° C. for 2 min followed by 25 cycles of 94° C. for 30 sec, 58° C. for 30 sec and 72° C. for 45+20 sec/cycle. This PCR amplification using primers seems to be inhibited when the concentration of the reassembled PCR products (i.e. the product of the first PCR without primers: step 2) is high. Check the concentrations of the first (step 2) and second (this step) PCR products on an agarose gel. If the second PCR amplification is not efficient, reduce the concentration of the substrate (i.e. the product of the first PCR) in the second PCR.

[0113] 5. Separate the PCR products on an agarose gel and recover the band corresponding to the size of the full-length gene.

[0114] 6. Purify the PCR products, subclone them and select desired mutants among the transformants.
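The incremental elongation times in the two PCR programs above (60+5 sec/cycle and 45+20 sec/cycle) can be tabulated with a short calculation. This is a minimal sketch that assumes the first cycle uses the base elongation time and each later cycle adds one increment, and that ignores the ramping time between temperatures. The function names are hypothetical.

```python
def elongation_times(cycles, base_s, increment_s):
    """Elongation time (s) for each cycle, assuming cycle 1 uses the base time."""
    return [base_s + increment_s * n for n in range(cycles)]

def program_seconds(cycles, denature_s, anneal_s, base_s, increment_s, initial_s=0):
    """Total hold time of a program, ignoring ramping between temperatures."""
    elongation = sum(elongation_times(cycles, base_s, increment_s))
    return initial_s + cycles * (denature_s + anneal_s) + elongation

# Reassembly PCR (step 2): 40 cycles, 94 C 1 min, 56 C 1 min, 72 C 60+5 sec/cycle.
reassembly = program_seconds(40, 60, 60, 60, 5)
# Amplification PCR (step 4): 96 C 2 min, then 25 cycles, 72 C 45+20 sec/cycle.
amplification = program_seconds(25, 30, 30, 45, 20, initial_s=120)

print(elongation_times(40, 60, 5)[-1])  # 255
print(reassembly, amplification)        # 11100 8745
```

Under these assumptions the final elongation step of the reassembly PCR lasts 255 sec, and the two programs take roughly 3 h 5 min and 2 h 26 min of hold time, respectively.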

[0115] (2-4) Isolation of Thermally Stable C23Os

[0116] Thermally stable clones were screened from 750 colonies obtained by single-stranded DNA-based shuffling. After treatment at 65° C. for 10 min, under which XylE (the xylE gene product, namely C23O produced from xylE) and NahH (the nahH gene product, namely C23O produced from nahH) were inactivated, 10 colonies exhibited residual C23O activity, showing that they contain enzymes thermally more stable than the wild-type C23Os. The amino acid sequences of the thermally stable C23Os are shown in FIG. 6. Although the nucleotide sequences of all 10 clones were different, the deduced amino acid sequences of clones 120 and 942, clones 202 and 450, and clones 315 and 1527 were identical.

[0117] (2-5) Notes Concerning DNA Polymerases

[0118] Primer design is crucial in DNA shuffling. Primer length and sequence should carefully be determined for successful amplification, cloning and gene expression. Follow the general guidelines on primer design for PCR cloning that are described in many textbooks. Sequences flanking the target gene can also be used for the primer design.

[0119] Taq DNA polymerase adds a single 3′-dA overhang to a blunt dsDNA template. For the cloning of such PCR products, linearized plasmid vectors containing single dT overhangs (T-vectors) were developed. The cloning efficiencies of the T-vectors, however, were variable. DNA polymerases with 3′→5′ exonuclease (proofreading) activity remove mispaired nucleotides from 3′ ends of dsDNA and generate blunt-end PCR products. The PCR products generated by these proofreading polymerases thus can be cloned into vectors by blunt-end ligation. However, blunt-end cloning of PCR products is less efficient than sticky-end cloning. Thus, it is advisable to introduce additional restriction sites at the 5′ end of each of the primers. As amplification proceeds, these primers are incorporated into the PCR product. Thus, the PCR products can be digested by appropriate restriction enzymes to clone in an appropriate vector.
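The advice above on adding restriction sites to the 5′ ends of primers can be sketched as follows. The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are standard, but the choice of these two enzymes, the gene-specific primer sequences, the two-base 5′ clamp and the function names are all assumptions made for illustration; none of them is taken from the protocol.

```python
RESTRICTION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

def add_site(primer, enzyme, clamp="GC"):
    """Prepend a short clamp and a restriction site to the 5' end of a primer."""
    return clamp + RESTRICTION_SITES[enzyme] + primer

def site_absent(insert_seq, enzyme):
    """True if the site does not occur inside the insert itself.

    Both sites used here are palindromic, so checking one strand is enough.
    """
    return RESTRICTION_SITES[enzyme] not in insert_seq.upper()

# Hypothetical gene-specific primer sequences, for illustration only.
forward = add_site("ATGAACAAAGGTGTAATGCG", "EcoRI")
reverse = add_site("TCAGGTCAGCACGGTCATGA", "BamHI")
print(forward)  # GCGAATTCATGAACAAAGGTGTAATGCG
print(reverse)  # GCGGATCCTCAGGTCAGCACGGTCATGA
```

Before ordering such primers it is worth confirming with a check like `site_absent` that the chosen sites do not occur within the genes being shuffled, since an internal site would cut the insert during cloning.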

[0120] Taq polymerase lacks a 3′→5′ exonuclease activity. In other words, it lacks the proofreading function of DNA replication, and thus exhibits a high rate of replication errors. This enzyme shows a higher error rate when the concentration of MnCl₂ in the reaction mixture is increased, or when the concentration of one nucleotide is lower than those of the other three nucleotides in the reaction mixture. For this reason, PCR with Taq polymerase has often been used for the mutagenesis of genes. Pfu and Pfu Turbo DNA polymerases, on the other hand, show high fidelity of DNA replication. Zhao et al. (18) have reported that using a proofreading DNA polymerase (Pfu or Pwo) in the reassembly step increased the frequency of active clones among the shuffling products. Therefore, PCR with the proofreading DNA polymerase (Pfu Turbo) is used here to reduce deleterious mutations.

[0121] The amplification by Pfu Turbo polymerase is not as powerful as that by Taq DNA polymerase. Replacing Pfu Turbo polymerase with Taq DNA polymerase or with a mixture of Pfu Turbo and Taq polymerases often provides better results.

[0122] References

[0123] The references cited above are shown below, the entire contents of each being hereby incorporated by reference.

[0124] 1. Stemmer, W. P. C. (1994) DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc. Natl. Acad. Sci. U.S.A., 91, 10747-10751.

[0125] 2. Stemmer, W. P. C. (1994) Rapid evolution of a protein in vitro by DNA shuffling. Nature (London), 370, 389-391.

[0126] 3. Shao, Z., Zhao, H., Giver, L., and Arnold, F. H. (1998) Random-priming in vitro recombination: an effective tool for directed evolution. Nucleic Acids Res., 26, 681-683.

[0127] 4. Zhao, H., Giver, L., Shao, Z., Affholter, J. A., and Arnold, F. H. (1998) Molecular evolution by staggered extension process (StEP) in vitro recombination. Nat. Biotechnol., 16, 258-261.

[0128] 5. Crameri, A., Whitehorn, E. A., Tate, E., and Stemmer, W. P. C. (1996) Improved green fluorescent protein by molecular evolution using DNA shuffling. Nat. Biotechnol., 14, 315-319.

[0129] 6. Crameri, A., Dawes, G., Rodriguez, E., Jr., Silver, S., and Stemmer, W. P. C. (1997) Molecular evolution of an arsenate detoxification pathway by DNA shuffling. Nat. Biotechnol., 15, 436-438.

[0130] 7. Moore, J. C. et al. (1997) Strategies for the in vitro evolution of protein function: Enzyme evolution by random recombination of improved sequences. J. Mol. Biol., 272, 336-347.

[0131] 8. Yano, T., Oue, S., and Kagamiyama, H. (1998) Directed evolution of an aspartate aminotransferase with new substrate specificities. Proc. Natl. Acad. Sci. U.S.A., 95, 5511-5515.

[0132] 9. Zhang, J. H., Dawes, G., and Stemmer, W. P. C. (1997) Directed evolution of a fucosidase from a galactosidase by DNA shuffling and screening. Proc. Natl. Acad. Sci. U.S.A., 94, 4504-4509.

[0133] 10. Patten, P. A., Howard, R. J., and Stemmer, W. P. (1997) Applications of DNA shuffling to pharmaceuticals and vaccines. Curr. Opin. Biotechnol., 8, 724-733.

[0134] 11. Crameri, A., Raillard, S. A., Bermudez, E., and Stemmer, W. P. C. (1998) DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature (London), 391, 288-291.

[0135] 12. Harayama, S. (1998) Artificial evolution by DNA shuffling. Trends Biotechnol., 16, 76-82.

[0136] 13. Kumamaru, T., Suenaga, H., Mitsuoka, M., Watanabe, T., and Furukawa, K. (1998) Enhanced degradation of polychlorinated biphenyls by directed evolution of biphenyl dioxygenase. Nat. Biotechnol., 16, 663-666.

[0137] 14. Chang, C.-C., Chen, T. T., Cox, B. W., Dawes, G. N., Stemmer, W. P. C., Punnonen, J., and Patten, P. A. (1999) Evolution of a cytokine using DNA family shuffling. Nat. Biotechnol., 17, 793-797.

[0138] 15. Hansson, L. O., B-Grob, R., Massoud, T., and Mannervik, B. (1999) Evolution of differential substrate specificities in Mu class glutathione transferases probed by DNA shuffling. J. Mol. Biol., 287, 265-276.

[0139] 16. Kikuchi, M., Ohnishi, K., and Harayama, S. (1999) Novel family shuffling methods for the in vitro evolution of enzymes. Gene, 236, 159-167.

[0140] 17. Kikuchi, M., Ohnishi, K., and Harayama, S. (2000) An effective family shuffling method using single-stranded DNA. Gene, 243, 133-137.

[0141] 18. Zhao, H., and Arnold, F. H. (1997) Optimization of DNA shuffling for high fidelity recombination. Nucleic Acids Res., 25, 1307-1308.

[0142] While the invention has been described in detail and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope thereof.

What is claimed is:
 1. A method for making libraries of hybrid polynucleotide molecules in which double-stranded polynucleotide molecules are not used as starting materials.
 2. The method of claim 1, wherein two types of single-stranded polynucleotide molecules are used as starting materials and wherein the first-type molecule comprises stretches of sequences containing one or more parts of homology and one or more parts of heterology to the complementary sequence of the second-type molecule.
 3. The method of claim 2, wherein the single-stranded polynucleotide molecules are fragmented and used as templates for de novo polynucleotide synthesis to create hybrid polynucleotide molecules.
 4. The method of claim 2, wherein mutations are introduced into hybrid polynucleotide molecules prior, during or after the production of the hybrid polynucleotide molecules.
 5. A method for making libraries of hybrid polynucleotide molecules, which comprises: (i) preparing two single-stranded polynucleotide molecules comprising sequences which are complementary to each other, (ii) randomly or non-randomly fragmenting the two single-stranded polynucleotide molecules, (iii) incubating the fragmented molecules under conditions such that hybridization of fragmented polynucleotide molecules occurs and de novo polynucleotide synthesis on the hybridized molecules occurs, (iv) denaturing the resultant elongated double-stranded polynucleotide molecules into single-stranded polynucleotide molecules, (v) incubating the resultant single-stranded polynucleotide molecules under conditions such that hybridization of single-stranded polynucleotide molecules occurs and de novo polynucleotide synthesis on the hybridized molecules occurs, and (vi) repeating at least two further cycles of steps (iv) and (v).